An Infrastructure to Mine Molecular Descriptors for Ligand Selection on Virtual Screening

Authors

  • Vinicius Rosa Seus
  • Giovanni Xavier Perazzo
  • Ana T. Winck
  • Adriano V. Werhli
  • Karina S. Machado
Abstract

Evaluating receptor-ligand interactions is an important step in rational drug design. The databases that provide ligand structures are growing on a daily basis, which makes it impossible to test all the ligands for a target receptor. Hence, ligands must be selected before testing. One possible approach is to evaluate a set of molecular descriptors. With the aim of describing the characteristics of promising compounds for a specific receptor, we introduce a data warehouse-based infrastructure to mine molecular descriptors for virtual screening (VS). We performed experiments with the HIV-1 protease receptor as the target and different compounds for this protein. A set of 9 molecular descriptors is taken as the predictive attributes, and the free energy of binding (FEB) is taken as the target attribute. By applying the J48 algorithm to the data, we obtain decision tree models that achieved up to 84% accuracy. The models indicate which molecular descriptors, and which of their values, are relevant to good FEB results. Using their rules, we performed ligand selection on the ZINC database. Our results show an important reduction in the number of ligands selected for VS experiments; for instance, the best selection model picked only 0.21% of the total number of drug-like ligands.
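As an illustration of the kind of rule the abstract describes, the sketch below scores a single threshold split with the gain ratio criterion used by C4.5 (the algorithm that Weka's J48 implements) and then applies the resulting rule to a small ligand list. It is a simplified sketch, not the authors' pipeline: the descriptor name `molecular_weight`, the ligand identifiers, the toy values, and the "good"/"bad" FEB classes are all hypothetical, and it assumes the FEB values were discretized into classes before training.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def gain_ratio(values, labels, threshold):
    """C4.5-style gain ratio for the binary split `value <= threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    total = len(labels)
    p_left, p_right = len(left) / total, len(right) / total
    info_gain = entropy(labels) - (p_left * entropy(left) + p_right * entropy(right))
    split_info = -(p_left * math.log2(p_left) + p_right * math.log2(p_right))
    return info_gain / split_info

def best_threshold(values, labels):
    """Return the midpoint between consecutive distinct values with the highest gain ratio."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(candidates, key=lambda t: gain_ratio(values, labels, t))

# Hypothetical training data: one descriptor and a discretized FEB class per ligand.
weights = [250, 300, 320, 410, 450, 500]
classes = ["good", "good", "good", "bad", "bad", "bad"]
threshold = best_threshold(weights, classes)

# Hypothetical screening library filtered with the learned rule
# "molecular_weight <= threshold => promising".
library = [
    {"id": "LIG-A", "molecular_weight": 280},
    {"id": "LIG-B", "molecular_weight": 480},
    {"id": "LIG-C", "molecular_weight": 340},
]
selected = [lig["id"] for lig in library if lig["molecular_weight"] <= threshold]
```

A full decision tree repeats this split search recursively over all descriptors and prunes the result; the filtering step at the end is what shrinks the candidate set before docking.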


Similar resources

Molecular Docking Based on Virtual Screening, Molecular Dynamics and Atoms in Molecules Studies to Identify the Potential Human Epidermal Receptor 2 Intracellular Domain Inhibitors

Human epidermal growth factor receptor 2 (HER2) is a member of the epidermal growth factor receptor family with tyrosine kinase activity. Overexpression of HER2 usually causes malignant transformation of cells and is responsible for breast cancer. In this work, virtual screening, molecular docking, quantum mechanics and molecular dynamics methods were employed to study protein–ligand ...


Target-oriented Generic Fingerprint-based Molecular Representation

The screening of chemical libraries is an important step in the drug discovery process. Existing chemical libraries contain up to millions of compounds. As screening at such a scale is expensive, virtual screening is often used instead. Several variants of virtual screening exist, and ligand-based virtual screening is one of them. It utilizes the similarity of screened chemical compo...


A Discussion of Measures of Enrichment in Virtual Screening: Comparing the Information Content of Descriptors with Increasing Levels of Sophistication

We have performed virtual screening using some very simple features, by employing the number of atoms per element as molecular descriptors but without regard to any structural information whatsoever. Surprisingly, these atom counts are able to outperform virtual-affinity-based fingerprints and Unity fingerprints in some activity classes. Although molecular weight and other biases were known in ...


SVM-Based Feature Selection for Characterization of Focused Compound Collections

Artificial neural networks, the support vector machine (SVM), and other machine learning methods for the classification of molecules are often considered as a "black box", since the molecular features that are most relevant for a given classifier are usually not presented in a human-interpretable form. We report on an SVM-based algorithm for the selection of relevant molecular features from a t...


The influence of protonation in protein-ligand docking

With its use in Virtual Screening (VS) experiments, protein-ligand docking has gained more and more importance in pharmaceutical research over the past years. To model the interactions between the protein and a ligand, empirical scoring functions are used in many programs. These scoring functions consist of different terms, which describe physical and chemical properties important for an attra...



Journal title:

Volume 2014, issue

Pages -

Publication date: 2014